Data summary

# total observations
nrow(rcomm)
## [1] 2311
# total commuters
unique(rcomm$ID) %>% length()
## [1] 25
# total commutes
dplyr::select(rcomm, ID, date_local, group) %>% unique() %>% nrow()
## [1] 69
# person-days
dplyr::select(rcomm, ID, date_local) %>% unique() %>% nrow()
## [1] 45
# days
dplyr::select(rcomm, date_local) %>% unique() %>% nrow()
## [1] 38
# total minutes of obs per commuter
g1 <- group_by(rcomm, ID) %>% summarize(n=n()) %>% ungroup() %>% arrange(n)
g1 %>%
  kable()
ID n
GMU1005 24
GMU1044 24
GMU1042 25
GMU1043 31
GMU1046 33
GMU1027 34
GMU1022 36
GMU1047 37
GMU1028 40
GMU1014 47
GMU1038 50
GMU1050 50
GMU1012 54
GMU1007 84
GMU1016 94
GMU1035 95
GMU1032 98
GMU1045 119
GMU1040 129
GMU1036 139
GMU1041 167
GMU1001 199
GMU1026 209
GMU1018 226
GMU1037 267
# summary of total minutes of obs per commuter
summarize(g1, min(n), max(n), median(n), mean(n))
## # A tibble: 1 × 4
##   `min(n)` `max(n)` `median(n)` `mean(n)`
##      <int>    <int>       <int>     <dbl>
## 1       24      267          54      92.4
# commutes per commuter
dplyr::select(rcomm, ID, date_local, group) %>% unique() %>% group_by(ID) %>% count() %>%
  ungroup() %>% rename(commutesobs= n) %>% count(commutesobs) %>% 
  mutate(perc = round(100 * n/sum(n), 1)) %>% kable()
commutesobs n perc
1 10 40
2 3 12
3 3 12
4 3 12
5 4 16
6 2 8

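Observed commutes ranged from 15 to 99 minutes. A typical commute lasted a little over 30 minutes: across commuters, the median of the per-commuter medians was 31 minutes and the mean of the per-commuter means was 33.6 minutes.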

## # A tibble: 1 × 2
##     min   max
##   <int> <int>
## 1    15    99
ID mean min max median
GMU1001 39.80000 16 50 44.0
GMU1005 24.00000 24 24 24.0
GMU1007 21.00000 17 26 20.5
GMU1012 54.00000 54 54 54.0
GMU1014 47.00000 47 47 47.0
GMU1016 31.33333 15 45 34.0
GMU1018 37.66667 24 63 32.0
GMU1022 36.00000 36 36 36.0
GMU1026 69.66667 37 99 73.0
GMU1027 34.00000 34 34 34.0
GMU1028 20.00000 18 22 20.0
GMU1032 32.66667 21 39 38.0
GMU1035 23.75000 15 34 23.0
GMU1036 27.80000 17 42 27.0
GMU1037 44.50000 26 76 39.5
GMU1038 25.00000 19 31 25.0
GMU1040 25.80000 15 45 23.0
GMU1041 33.40000 16 58 28.0
GMU1042 25.00000 25 25 25.0
GMU1043 31.00000 31 31 31.0
GMU1044 24.00000 24 24 24.0
GMU1045 29.75000 23 40 28.0
GMU1046 16.50000 15 18 16.5
GMU1047 37.00000 37 37 37.0
GMU1050 50.00000 50 50 50.0
## # A tibble: 1 × 4
##   `mean(median)` `median(mean)` `mean(mean)` `median(median)`
##            <dbl>          <dbl>        <dbl>            <dbl>
## 1           33.3           31.3         33.6               31
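
The commute-length summaries above could be produced along the following lines. This is only a sketch on toy data, not the code behind this report. It assumes that each row of `rcomm` is one observed minute and that a commute is identified by `ID`, `date_local`, and `group` (the same key used for the commute count earlier). The data frame `toy` and the output column names are hypothetical.

```r
library(dplyr)

# Toy stand-in for rcomm: one row per observed minute (hypothetical values).
toy <- tibble(
  ID         = c(rep("A", 5), rep("A", 3), rep("B", 4)),
  date_local = as.Date(c(rep("2018-05-08", 5), rep("2018-05-09", 3),
                         rep("2018-05-08", 4))),
  group      = 0L
)

# Assumption: commute length = number of minute-level rows in the commute.
commute_len <- toy %>%
  count(ID, date_local, group, name = "minutes")

# Overall range of commute lengths.
len_range <- commute_len %>%
  summarize(min = min(minutes), max = max(minutes))

# Per-commuter summaries of commute length.
len_by_id <- commute_len %>%
  group_by(ID) %>%
  summarize(mean_len = mean(minutes), min_len = min(minutes),
            max_len = max(minutes), median_len = median(minutes))

# Summaries across commuters (each commuter weighted equally).
len_across <- len_by_id %>%
  summarize(mean_of_means = mean(mean_len),
            median_of_medians = median(median_len))
```

Summarizing per commuter first gives each commuter equal weight, so the result can differ from a mean taken directly over all commutes.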

PM

Number of zeroes

## # A tibble: 2 × 2
##   ID          n
##   <chr>   <int>
## 1 GMU1001    29
## 2 GMU1050     3

Histogram of PM

Histogram of PM by ID

Histogram of log(PM + 0.01) by ID
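
A minimal sketch of the zero count and the offset log transform named in the heading above, on toy data. The column name `PM` and the data frame `toy` are assumptions for illustration. Since log(0.01) is about -4.61, a zero reading maps to that value, which is consistent with the `lmin` of -4.61 reported below for the two participants with zeroes.

```r
library(dplyr)

# Toy minute-level PM readings (hypothetical values).
toy <- tibble(
  ID = c("A", "A", "A", "B", "B"),
  PM = c(0, 2.5, 0, 4.8, 6.5)
)

# Count exact zeroes per participant; participants with none are dropped.
zeros <- toy %>%
  filter(PM == 0) %>%
  count(ID)

# Offset log transform so that zero readings stay finite.
toy <- toy %>%
  mutate(lPM = log(PM + 0.01))
```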

Summarize

By ID

ID lmean lsd lmin lmax mean sd min max
GMU1001 -0.04 2.05 -4.61 3.09 2.46 2.84 0.00 21.87
GMU1005 1.56 0.17 1.37 1.88 4.80 0.85 3.91 6.52
GMU1007 1.17 0.84 -1.11 2.56 4.28 2.90 0.32 12.89
GMU1012 0.68 0.15 0.47 1.06 1.98 0.32 1.59 2.88
GMU1014 1.57 0.42 0.49 2.54 5.21 2.03 1.62 12.62
GMU1016 1.79 0.58 0.84 3.92 7.42 7.48 2.30 50.22
GMU1018 0.88 0.51 -1.04 2.32 2.75 1.64 0.34 10.12
GMU1022 2.25 0.18 1.82 2.60 9.66 1.71 6.14 13.46
GMU1026 0.75 0.51 -2.19 2.20 2.37 1.20 0.10 9.03
GMU1027 1.30 0.30 0.65 2.08 3.83 1.35 1.91 8.03
GMU1028 2.09 0.37 1.30 3.09 8.63 3.41 3.67 21.97
GMU1032 0.44 0.44 -2.08 1.62 1.67 0.63 0.11 5.05
GMU1035 1.35 1.27 -0.66 3.88 9.32 14.00 0.50 48.25
GMU1036 1.45 0.66 -0.01 2.78 5.22 3.41 0.98 16.05
GMU1037 1.41 0.73 0.25 3.17 5.61 5.49 1.28 23.73
GMU1038 1.61 0.74 1.04 3.89 7.49 10.18 2.81 48.72
GMU1040 0.70 0.27 0.30 2.06 2.09 0.78 1.34 7.87
GMU1041 1.71 0.61 1.05 2.90 6.63 4.05 2.84 18.07
GMU1042 1.79 0.29 1.49 2.36 6.23 1.88 4.41 10.54
GMU1043 0.60 0.27 -0.02 1.11 1.88 0.50 0.97 3.04
GMU1044 2.91 0.15 2.63 3.15 18.60 2.84 13.84 23.28
GMU1045 0.14 0.24 -0.12 1.03 1.17 0.33 0.88 2.79
GMU1046 2.35 0.22 2.00 2.85 10.76 2.44 7.36 17.25
GMU1047 1.58 0.35 1.10 2.25 5.16 1.89 3.00 9.45
GMU1050 0.74 1.46 -4.61 2.62 3.24 2.41 0.00 13.79

Across IDs

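The mean minute-level PM2.5 across participants was 5.5 µg/m³, with individual observations ranging from 0 to 50.2 µg/m³. In general, variability between participants was greater than variability within a participant over the commutes (SD of participant means 3.86 µg/m³ vs. a mean within-participant SD of 3.06 µg/m³), though within-participant variability was large for some participants (e.g., SD of 14.00 µg/m³ for GMU1035).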

name med mean sd
lmean 1.41 1.31 0.71
mean 5.16 5.54 3.86
sd 2.03 3.06 3.22
## # A tibble: 1 × 2
##     min   max
##   <dbl> <dbl>
## 1     0  50.2
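
One rough way to compare between- and within-participant variability is to take the SD of the per-participant means and set it against the average per-participant SD. The sketch below does this on toy data. It is an informal comparison under assumed names (`toy`, `PM`, `pm_mean`, `pm_sd`), not a formal variance decomposition and not necessarily how the table above was built.

```r
library(dplyr)

# Toy minute-level PM readings for three participants (hypothetical values).
toy <- tibble(
  ID = rep(c("A", "B", "C"), each = 3),
  PM = c(1, 2, 3, 9, 10, 11, 5, 6, 7)
)

# Per-participant mean and SD.
by_id <- toy %>%
  group_by(ID) %>%
  summarize(pm_mean = mean(PM), pm_sd = sd(PM))

# Between: spread of participant means. Within: average participant SD.
across_ids <- by_id %>%
  summarize(between_sd = sd(pm_mean), mean_within_sd = mean(pm_sd))
```

In this toy example the participant means are far apart while each participant's readings are tightly clustered, so `between_sd` exceeds `mean_within_sd`.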

Violin plots

Box plots

Commute summaries

Mean by commute

SD by commute

Roadiness/Road type/Speed

  1. Possible misclassification

Most observations (N=1215, 52.6%) were on local roads. 600 observations (26%) were on highways and 399 (17.3%) were on local connecting roads, which are XX. The remainder (N=97, 4.2%) were on ramps, tunnels, or other road types.

## # A tibble: 4 × 3
##   rtype            n  perc
##   <fct>        <int> <dbl>
## 1 High/SecHigh   600  26  
## 2 LocalConn      399  17.3
## 3 Local         1215  52.6
## 4 Other           97   4.2

The distribution changes little if we instead take the modal (most frequent) road type within each commute:

rtypeMode n perc
High/SecHigh 17 24.6
LocalConn 11 15.9
Local 40 58.0
Other 1 1.4
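
A sketch of how a per-commute modal road type could be computed, on toy data. The helper `stat_mode()` is hypothetical, and its tie-breaking rule (the level that comes first in the factor wins) is an assumption; the report does not say how ties were handled.

```r
library(dplyr)

# Toy minute-level road types for two commutes (hypothetical values).
toy <- tibble(
  ID         = c(rep("A", 5), rep("B", 4)),
  date_local = as.Date("2018-05-08"),
  group      = 0L,
  rtype      = factor(
    c("Local", "Local", "High/SecHigh", "Local", "LocalConn",
      "High/SecHigh", "High/SecHigh", "Local", "High/SecHigh"),
    levels = c("High/SecHigh", "LocalConn", "Local", "Other")
  )
)

# Hypothetical helper: most frequent level; ties go to the earliest level.
stat_mode <- function(x) {
  tab <- table(x)
  names(tab)[which.max(tab)]
}

# One modal road type per commute.
rtype_mode <- toy %>%
  group_by(ID, date_local, group) %>%
  summarize(rtypeMode = stat_mode(rtype), .groups = "drop")

# Share of commutes by modal road type.
rtype_tab <- rtype_mode %>%
  count(rtypeMode) %>%
  mutate(perc = round(100 * n / sum(n), 1))
```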

Roadiness

Reporting the raw values is not very informative because roadiness is standardized (overall mean 0, SD 1):

mean sd min max
0 1 -4.59 2.54
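
For reference, standardizing over all observations can be sketched as below. The column `roadiness_raw` is a hypothetical name for the unstandardized score. The overall mean is 0 and the SD is 1 by construction, while per-participant means can still differ from 0.

```r
library(dplyr)

# Toy unstandardized roadiness scores (hypothetical values).
toy <- tibble(roadiness_raw = c(2, 4, 4, 4, 5, 5, 7, 9))

# z-score over all observations: subtract the mean, divide by the SD.
toy <- toy %>%
  mutate(roadiness = (roadiness_raw - mean(roadiness_raw)) / sd(roadiness_raw))

overall <- toy %>%
  summarize(mean = mean(roadiness), sd = sd(roadiness),
            min = min(roadiness), max = max(roadiness))
```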

ID mean sd min max
GMU1001 -0.72 0.73 -2.14 0.62
GMU1005 0.58 0.25 0.26 0.92
GMU1007 -0.35 0.42 -1.08 0.30
GMU1012 -0.78 0.56 -1.61 0.30
GMU1014 0.31 0.17 -0.19 0.43
GMU1016 0.55 0.20 0.17 0.92
GMU1018 -0.52 0.72 -1.97 0.30
GMU1022 -0.12 0.51 -1.00 0.62
GMU1026 -0.34 1.96 -4.59 2.54
GMU1027 0.06 0.37 -0.90 0.36
GMU1028 -0.78 0.61 -1.53 0.44
GMU1032 -0.67 0.55 -1.56 0.37
GMU1035 0.20 0.80 -1.70 2.35
GMU1036 0.23 0.48 -0.84 0.96
GMU1037 0.11 0.69 -2.01 1.08
GMU1038 -0.27 0.49 -1.01 0.45
GMU1040 0.78 0.54 0.09 1.80
GMU1041 0.78 0.55 -0.28 2.06
GMU1042 -1.27 0.89 -2.41 0.40
GMU1043 -0.22 0.60 -0.88 0.70
GMU1044 0.59 0.16 0.30 0.92
GMU1045 1.13 0.66 -0.02 2.27
GMU1046 -0.35 0.44 -1.34 0.30
GMU1047 0.27 0.54 -1.06 0.62
GMU1050 0.30 0.70 -1.21 1.80

Speed

mean sd min max med IQR
24.48 21.38 0 93.76 20.41 32.94
ID mean sd min max med IQR
GMU1001 23.62 16.71 0.00 62.53 22.12 27.23
GMU1005 25.46 24.83 0.00 72.42 14.08 32.81
GMU1007 20.32 14.75 0.00 50.16 18.03 24.06
GMU1012 17.74 16.92 0.00 67.74 13.66 19.73
GMU1014 6.28 13.26 0.00 48.15 0.00 1.82
GMU1016 19.43 16.11 0.00 55.89 18.61 24.99
GMU1018 20.88 18.00 0.00 58.12 18.47 32.95
GMU1022 20.85 16.04 0.00 49.46 22.91 32.34
GMU1026 48.38 28.47 0.00 93.76 57.91 52.99
GMU1027 22.57 23.61 0.00 79.88 18.62 42.41
GMU1028 42.06 21.12 1.97 68.49 48.74 33.77
GMU1032 22.13 17.35 0.00 77.89 19.49 27.17
GMU1035 30.24 21.82 0.00 71.64 31.86 40.30
GMU1036 18.19 15.28 0.00 63.86 17.51 28.57
GMU1037 20.26 15.29 0.00 57.83 16.66 23.25
GMU1038 23.47 18.68 0.00 60.42 17.41 29.75
GMU1040 14.09 13.31 0.00 76.30 10.62 23.89
GMU1041 17.41 18.62 0.00 62.58 11.64 29.67
GMU1042 51.88 19.37 0.46 70.07 62.75 26.64
GMU1043 18.34 17.26 0.12 52.43 16.73 30.74
GMU1044 29.52 15.80 3.65 63.16 29.47 18.25
GMU1045 32.93 18.72 0.00 74.55 34.10 29.79
GMU1046 18.57 20.65 0.00 91.91 11.71 18.86
GMU1047 14.67 20.42 0.00 56.48 0.00 29.42
GMU1050 38.35 27.83 0.00 78.38 34.94 51.36

## # A tibble: 69 × 3
## # Groups:   ID [25]
##    ID      id2                 maxt
##    <chr>   <fct>              <dbl>
##  1 GMU1026 GMU10262018-11-270    98
##  2 GMU1037 GMU10372019-02-041    75
##  3 GMU1026 GMU10262018-11-290    72
##  4 GMU1018 GMU10182018-11-080    62
##  5 GMU1041 GMU10412019-02-150    57
##  6 GMU1012 GMU10122018-10-180    53
##  7 GMU1001 GMU10012018-05-091    49
##  8 GMU1050 GMU10502019-03-130    49
##  9 GMU1001 GMU10012018-05-080    48
## 10 GMU1037 GMU10372019-02-051    47
## # … with 59 more rows

Weather

L1 variables: created as a 1-day lag of the corresponding daily variable.

Variables

https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt

  • PRCP = Precipitation (tenths of mm)
  • SNOW = Snowfall (mm)
  • TMAX = Maximum temperature (degrees C, original tenths of degrees C)
  • TMIN = Minimum temperature (degrees C, original tenths of degrees C)
  • AWND = Average daily wind speed (m/s, original tenths of meters per second)
  • wdf2, wdf5 = Direction of fastest 2-minute and 5-second wind (degrees)
  • cat2, cat5 = Categorical direction of fastest wind (derived from wdf2 and wdf5)
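
The unit conversions and the 1-day lag described above might look like the sketch below. It uses toy values, and the data frame `wx` and its column names are assumptions. The lag is built with a date-shifted self-join rather than a positional lag, so a missing day yields `NA` instead of silently borrowing the value from two days earlier.

```r
library(dplyr)

# Toy daily records in original GHCN-Daily units (hypothetical values).
wx <- tibble(
  date = as.Date("2018-05-07") + 0:3,
  TMAX = c(250, 261, 239, 222),  # tenths of degrees C
  AWND = c(31, 45, 28, 36),      # tenths of m/s
  PRCP = c(0, 12, 0, 30)         # tenths of mm, left unconverted
)

# Convert tenths to whole units where the variable list says so.
wx <- wx %>%
  mutate(tmax = TMAX / 10, awnd = AWND / 10, prcp = PRCP)

# Shift each day's values forward one day and relabel them as L1.
lagged <- wx %>%
  mutate(date = date + 1) %>%
  select(date, tmaxL1 = tmax, awndL1 = awnd, prcpL1 = prcp)

# Each date now carries the previous calendar day's values.
wx <- left_join(wx, lagged, by = "date")
```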

EDA

name mean sd min max
awnd 3.76 1.67 1.00 9.10
awndL1 3.41 1.83 1.00 9.10
awndL1m 3.69 1.55 1.45 8.70
prcp 29.85 54.79 0.00 204.23
prcpL1 45.31 108.38 0.00 645.98
prcpL1m 28.48 71.22 0.00 328.49
RH 68.02 18.42 22.09 99.89
snow 2.94 10.76 0.00 61.09
snowL1 1.18 6.15 0.00 40.56
snowL1m 2.85 8.68 0.00 34.16
tavg 8.32 7.36 -4.97 24.55
tmax 13.47 8.03 2.15 29.22
tmaxL1 12.83 7.60 2.60 28.77
tmaxL1m 12.96 8.43 -2.02 30.24
tmin 3.17 7.24 -12.10 21.05
tminL1 3.06 6.91 -6.72 21.05
tminL1m 2.54 7.93 -15.80 21.05

Snow and precipitation: binary

Variable        Value   N (%)
Precipitation   None    18 (40)
Precipitation   Some    27 (60)
Snow            None    40 (88.9)
Snow            Some     5 (11.1)
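
A sketch of the none/some dichotomization with a combined count-and-percent column, on toy person-day data. The cutoff (any amount above zero counts as "some") is an assumption inferred from the None/Some labels, and `days`, `prcpbin`, `snowbin`, and `NPerc` are illustrative names.

```r
library(dplyr)

# Toy person-day weather values (hypothetical).
days <- tibble(
  prcp = c(0, 12, 0, 30, 5),
  snow = c(0, 0, 0, 25, 0)
)

# Assumption: any amount above zero counts as "some".
days <- days %>%
  mutate(prcpbin = as.integer(prcp > 0),
         snowbin = as.integer(snow > 0))

# Count and percent in one column, e.g. "3 (60)".
prcp_tab <- days %>%
  count(prcpbin) %>%
  mutate(NPerc = paste0(n, " (", round(100 * n / sum(n), 1), ")"))
```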

By observation

NOTE: Do not use these summaries. They are computed by observation, not by person-day.

name mean sd min max
awnd 3.51 1.50 1.00 9.10
awndL1 3.39 1.86 1.00 9.10
awndL1m 3.53 1.49 1.45 8.70
group 0.41 0.60 0.00 2.00
prcp 35.16 61.77 0.00 204.23
prcpL1 46.07 95.39 0.00 645.98
prcpL1m 27.03 64.48 0.00 328.49
snow 2.03 8.80 0.00 61.09
snowL1 0.84 5.01 0.00 40.56
snowL1m 2.67 8.13 0.00 34.16
tavg 7.89 6.97 -4.97 24.55
tmax 12.96 7.79 2.15 29.22
tmaxL1 12.19 7.38 2.60 28.77
tmaxL1m 12.79 8.22 -2.02 30.24
tmin 2.82 6.82 -12.10 21.05
tminL1 2.70 6.76 -6.72 21.05
tminL1m 2.27 7.51 -15.80 21.05

Snow and precipitation: binary

name value n perc
prcpbin 0 29 42.0
prcpbin 1 40 58.0
prcpbinL1 0 25 36.2
prcpbinL1 1 44 63.8
prcpbinL1m 0 17 24.6
prcpbinL1m 1 52 75.4
snowbin 0 63 91.3
snowbin 1 6 8.7
snowbinL1 0 65 94.2
snowbinL1 1 4 5.8
snowbinL1m 0 59 85.5
snowbinL1m 1 10 14.5

Wind direction

  • Wind dir: 5

  • Wind dir: 2
cat2sm n perc
SE 15 33.3
NW 22 48.9
Other 8 17.8

Compare commutes

Possible multiple obs of same commute:

Daily PM

## # A tibble: 1 × 4
##   `mean(daily)` `sd(daily)` `min(daily)` `max(daily)`
##           <dbl>       <dbl>        <dbl>        <dbl>
## 1          6.71        3.98          1.5         21.7

Participant characteristics

## # A tibble: 1 × 4
##   `mean(age)` `sd(age)` `min(age)` `max(age)`
##         <dbl>     <dbl>      <int>      <int>
## 1        26.4      8.03         18         46
Variable     Value                                            N (%)
Race         Asian                                            11 (44)
Race         White only                                        9 (36)
Race         Other/Did not specify                             5 (20)
Ethnicity    Hispanic or Latino                                2 (8)
Ethnicity    Not Hispanic or Latino                           23 (92)
Employed     Part-time                                         7 (28)
Employed     Full-time                                        17 (68)
Education    High school diploma or GED                        6 (24)
Education    Some college or technical school                  7 (28)
Education    College degree or technical school degree         5 (20)
Education    Some graduate school                              2 (8)
Education    Graduate school degree or post-graduate degree    5 (20)
GMUstudent   Yes                                              15 (60)
GMUstudent   No                                                9 (36)
Children     None                                             18 (72)
Children     1+                                                7 (28)